fix: validate reviewAgentLogPath to prevent path injection by msukkari · Pull Request #1134 · sourcebot-dev/sourcebot

msukkari · 2026-04-18T01:57:33Z

Fixes SOU-930

Summary

This PR addresses CodeQL alert #19 (js/path-injection) by adding path validation to the invokeDiffReviewLlm function in the review agent.

Problem

The reviewAgentLogPath parameter was passed to fs.appendFileSync without validation, creating a latent path injection vulnerability. While the current code constructs the path safely using pullRequest.number (which is always an integer from the GitHub API), the function itself did not validate the path, meaning any future changes to the call chain could introduce a security issue.

Solution

Added a validateLogPath helper function that:

- Resolves the provided path to an absolute path
- Verifies the resolved path starts with the expected base directory (DATA_CACHE_DIR/review-agent/)
- Throws an error if the path attempts to escape the log directory

The validation is performed before each fs.appendFileSync call to ensure defense-in-depth.

Additionally, the log directory constant is now exported from invokeDiffReviewLlm.ts and shared with app.ts to ensure a single source of truth.
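
The three steps listed under Solution can be sketched roughly as follows. This is an illustration, not the merged code: the hard-coded `LOG_DIR` value, the optional `baseDir` parameter, and the error message are assumptions made so the snippet runs on its own.

```typescript
import * as path from "node:path";

// Assumption: stands in for path.join(env.DATA_CACHE_DIR, "review-agent").
const LOG_DIR = path.resolve("/tmp/sourcebot-cache", "review-agent");

// Hypothetical helper mirroring the PR description; the real signature may differ.
function validateLogPath(logPath: string, baseDir: string = LOG_DIR): string {
    // Step 1: resolve to an absolute path, collapsing any ".." segments.
    const resolved = path.resolve(logPath);

    // Step 2: compare against the base directory plus a trailing separator,
    // so a sibling such as "review-agent-evil" does not pass a prefix check.
    const prefix = baseDir.endsWith(path.sep) ? baseDir : baseDir + path.sep;

    // Step 3: reject anything that resolves outside the log directory.
    if (!resolved.startsWith(prefix)) {
        throw new Error(`Log path escapes the review agent log directory: ${logPath}`);
    }
    return resolved;
}
```

Appending the separator before the `startsWith` check is a design choice of this sketch; a bare prefix comparison would accept directories whose names merely begin with the base directory's name.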

Changes

- Added path import to invokeDiffReviewLlm.ts
- Added exported REVIEW_AGENT_LOG_DIR constant for the expected log directory
- Added validateLogPath() helper function that validates paths are within the expected directory
- Applied validation before both fs.appendFileSync calls (prompt and response logging)
- Updated app.ts to import and use REVIEW_AGENT_LOG_DIR instead of defining it locally

Testing

- All existing tests pass
- Build compiles successfully
- Lint checks pass

References

Linear Issue: SOU-930

Summary by CodeRabbit

Bug Fixes
- Ensured the review agent can only write logs inside its designated log directory; added path validation to prevent writing outside that directory.
- Logged a changelog "Fixed" entry documenting the path validation for review-agent log writing.
Documentation
- Updated PR description guidelines to auto-link Linear issues when a Linear ID is provided.

Add path validation to invokeDiffReviewLlm to ensure the log path stays within the expected DATA_CACHE_DIR/review-agent directory. This addresses CodeQL alert #19 (js/path-injection) by resolving the path and verifying it does not escape the log directory. The validation is performed before each fs.appendFileSync call to prevent path traversal attacks even if the call chain changes in the future. Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>

coderabbitai · 2026-04-18T01:57:41Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e3ba9f07-a159-4621-bcef-2b89c15a0298

📥 Commits

Reviewing files that changed from the base of the PR and between 154b096 and 3ed9937.

📒 Files selected for processing (1)

CHANGELOG.md

✅ Files skipped from review due to trivial changes (1)

CHANGELOG.md

Walkthrough

Log path handling for the review agent was hardened: log directory resolution was centralized via getReviewAgentLogDir(), and reviewAgentLogPath values are resolved and validated to prevent directory traversal before writing prompts or LLM responses. CHANGELOG and PR guidelines were also updated.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Review agent logging & validation `packages/web/src/features/agents/review-agent/nodes/invokeDiffReviewLlm.ts` | Added and exported `getReviewAgentLogDir()` and `validateLogPath()`; resolve and validate `reviewAgentLogPath` before writing prompt and before writing OpenAI response to prevent path traversal. |
| Log directory usage `packages/web/src/features/agents/review-agent/app.ts` | Replaced inline `path.join(env.DATA_CACHE_DIR, "review-agent")` with `getReviewAgentLogDir()` for existence check, recursive creation, and timestamped log path generation. |
| Changelog `CHANGELOG.md` | Added an “[Unreleased] → Fixed” entry documenting the path injection fix in the review agent’s log writing flow. |
| Docs – PR guidelines `CLAUDE.md` | Updated PR description guidance to require adding `Fixes <LinearIssueID>` at the top when a Linear issue ID is provided. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Proc as processGitHubPullRequest
  participant App as app.ts
  participant Agent as invokeDiffReviewLlm
  participant FS as Filesystem
  participant LLM as OpenAI

  Proc->>App: start processing PR
  App->>Agent: call process with PR data
  Agent->>Agent: dir = getReviewAgentLogDir()
  Agent->>FS: resolve(proposedLogPath)
  Agent->>Agent: validateLogPath(resolvedPath, dir)
  Agent->>FS: write prompt file (validated path)
  Agent->>LLM: send prompt -> await response
  LLM-->>Agent: response
  Agent->>FS: resolve(responseLogPath)
  Agent->>Agent: validateLogPath(resolvedResponsePath, dir)
  Agent->>FS: write response file (validated path)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly matches the main security fix: validating reviewAgentLogPath to prevent path injection vulnerability. |

Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>

…keDiffReviewLlm.ts Export the log directory constant from invokeDiffReviewLlm.ts and import it in app.ts to ensure a single source of truth for the review agent log directory path. Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>

Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

packages/web/src/features/agents/review-agent/nodes/invokeDiffReviewLlm.ts (1)

31-48: Minor: redundant double validation.

reviewAgentLogPath is a function parameter (a string) that doesn't change between the two validateLogPath calls, so the second invocation on Line 46 is redundant. Validating once near the top of invokeDiffReviewLlm (when reviewAgentLogPath is defined) would be simpler and still safe — there's no TOCTOU benefit here since the check is purely string-based.

♻️ Proposed refactor

     if (!env.OPENAI_API_KEY) {
         logger.error("OPENAI_API_KEY is not set, skipping review agent");
         throw new Error("OPENAI_API_KEY is not set, skipping review agent");
     }
+
+    if (reviewAgentLogPath) {
+        validateLogPath(reviewAgentLogPath);
+    }
     
     const openai = new OpenAI({
         apiKey: env.OPENAI_API_KEY,
     });
 
     if (reviewAgentLogPath) {
-        validateLogPath(reviewAgentLogPath);
         fs.appendFileSync(reviewAgentLogPath, `\n\nPrompt:\n${prompt}`);
     }
@@
         if (reviewAgentLogPath) {
-            validateLogPath(reviewAgentLogPath);
             fs.appendFileSync(reviewAgentLogPath, `\n\nResponse:\n${openaiResponse}`);
         }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/features/agents/review-agent/nodes/invokeDiffReviewLlm.ts`
around lines 31 - 48, The code calls validateLogPath(reviewAgentLogPath) twice
inside invokeDiffReviewLlm — once before appending the prompt and again before
appending the response — which is redundant because reviewAgentLogPath is an
immutable parameter; remove the second call. Update invokeDiffReviewLlm to
validate reviewAgentLogPath once (when reviewAgentLogPath is truthy) before the
first fs.appendFileSync, and keep the subsequent
fs.appendFileSync(reviewAgentLogPath, `\n\nResponse:\n${openaiResponse}`)
without re-validating; reference validateLogPath, reviewAgentLogPath,
invokeDiffReviewLlm, and fs.appendFileSync to locate the lines to change.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/web/src/features/agents/review-agent/nodes/invokeDiffReviewLlm.ts`:
- Around line 10-17: Make REVIEW_AGENT_LOG_DIR an absolute canonical path by
using path.resolve on env.DATA_CACHE_DIR (e.g., REVIEW_AGENT_LOG_DIR =
path.resolve(env.DATA_CACHE_DIR, 'review-agent')), and update validateLogPath to
canonicalize both the base and the incoming logPath (use path.resolve) then
compute const rel = path.relative(REVIEW_AGENT_LOG_DIR, resolved); reject the
path when rel is empty, is absolute (path.isAbsolute(rel)), or startsWith('..')
and throw the existing error; reference symbols: REVIEW_AGENT_LOG_DIR and
validateLogPath.

---

Nitpick comments:
In `@packages/web/src/features/agents/review-agent/nodes/invokeDiffReviewLlm.ts`:
- Around line 31-48: The code calls validateLogPath(reviewAgentLogPath) twice
inside invokeDiffReviewLlm — once before appending the prompt and again before
appending the response — which is redundant because reviewAgentLogPath is an
immutable parameter; remove the second call. Update invokeDiffReviewLlm to
validate reviewAgentLogPath once (when reviewAgentLogPath is truthy) before the
first fs.appendFileSync, and keep the subsequent
fs.appendFileSync(reviewAgentLogPath, `\n\nResponse:\n${openaiResponse}`)
without re-validating; reference validateLogPath, reviewAgentLogPath,
invokeDiffReviewLlm, and fs.appendFileSync to locate the lines to change.
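
The inline comment above about canonicalizing with `path.resolve` and checking `path.relative` could look something like the sketch below. The function name `isInsideDir` and its boolean return are hypothetical; the review asks for `validateLogPath` to throw instead. The parent-directory test here is slightly narrower than the plain `startsWith('..')` the comment mentions, so that a file literally named `..notes.log` inside the directory is still accepted; that refinement is an assumption of this sketch, not part of the review.

```typescript
import * as path from "node:path";

// Hypothetical containment check based on path.relative.
function isInsideDir(baseDir: string, candidate: string): boolean {
    // Canonicalize both sides so relative inputs and ".." segments are collapsed.
    const base = path.resolve(baseDir);
    const resolved = path.resolve(candidate);
    const rel = path.relative(base, resolved);

    // rel === ""         -> candidate is the directory itself, not a file in it
    // rel is absolute    -> candidate is on another root (for example, another drive on Windows)
    // rel is ".." or starts with "../" -> candidate sits above the base directory
    const climbsOut = rel === ".." || rel.startsWith(".." + path.sep);
    return rel !== "" && !path.isAbsolute(rel) && !climbsOut;
}
```

A caller wanting the behaviour described in the review would throw when this returns `false`.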

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d34658fd-f05a-4b07-abcf-0440067a55bf

📥 Commits

Reviewing files that changed from the base of the PR and between 2c89825 and 9b6e216.

📒 Files selected for processing (3)

CHANGELOG.md
packages/web/src/features/agents/review-agent/app.ts
packages/web/src/features/agents/review-agent/nodes/invokeDiffReviewLlm.ts

msukkari · 2026-04-18T02:09:27Z

test

Change REVIEW_AGENT_LOG_DIR from a top-level constant to a getReviewAgentLogDir() function to avoid evaluating env.DATA_CACHE_DIR at module load time, which fails during Next.js build when environment variables are not yet available. Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>
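
As a rough illustration of the change described in this commit message, the sketch below defers the environment lookup until the function is called. It reads `process.env.DATA_CACHE_DIR` directly and throws on a missing value; both are assumptions standing in for the project's own `env` object, whose behaviour is not shown on this page.

```typescript
import * as path from "node:path";

// Before (evaluated when the module is imported, so it runs during the build):
//   export const REVIEW_AGENT_LOG_DIR = path.join(env.DATA_CACHE_DIR, "review-agent");

// After: the lookup happens only when a caller actually needs the directory.
// Assumption: the optional parameter exists only to make this sketch testable.
function getReviewAgentLogDir(
    dataCacheDir: string | undefined = process.env.DATA_CACHE_DIR,
): string {
    if (!dataCacheDir) {
        throw new Error("DATA_CACHE_DIR is not set");
    }
    return path.join(dataCacheDir, "review-agent");
}
```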

Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>

This reverts commit 476081e.

* feat(web): GitLab MR Review Agent

Adds support for the AI Review Agent to review GitLab Merge Requests, mirroring the existing GitHub PR review functionality. Also fixes several bugs discovered during implementation and improves the shared review pipeline.

---

## New files

### `packages/web/src/features/agents/review-agent/nodes/gitlabMrParser.ts`

Parses a GitLab MR webhook payload into the shared `sourcebot_pr_payload` format. Calls `MergeRequests.show()` and `MergeRequests.allDiffs()` in parallel — the API response is used for `title`, `description`, `sha`, and `diff_refs` (which can be absent in webhook payloads for `update` action events), while per-file diffs are parsed using the existing `parse-diff` library.

### `packages/web/src/features/agents/review-agent/nodes/gitlabPushMrReviews.ts`

Posts review comments back to GitLab using `MergeRequestDiscussions.create()` with a position object carrying `base_sha`, `head_sha`, and `start_sha`. Falls back to `MergeRequestNotes.create()` (a general MR note) if the inline comment is rejected by the API (e.g. the line is not within the diff), ensuring reviews are always surfaced even when precise positioning fails.

### Test files (4 new files, 34 tests total)

- `githubPrParser.test.ts` — diff parsing and metadata mapping for the GitHub parser
- `githubPushPrReviews.test.ts` — single-line vs multi-line comment parameters, error resilience
- `gitlabMrParser.test.ts` — API call arguments, metadata mapping, diff parsing, edge cases (empty diffs, nested groups, null description, API failures)
- `gitlabPushMrReviews.test.ts` — inline comment posting, fallback behaviour, missing `diff_refs` guard, multi-file iteration

---

## Modified files

### `packages/web/src/features/agents/review-agent/types.ts`

- Added `sourcebot_diff_refs` schema/type (`base_sha`, `head_sha`, `start_sha`) and an optional `diff_refs` field on `sourcebot_pr_payload`
- Added `GitLabMergeRequestPayload` and `GitLabNotePayload` interfaces for webhook event typing

### `packages/web/src/features/agents/review-agent/app.ts`

- Added `processGitLabMergeRequest()` function mirroring `processGitHubPullRequest()`: sets up logging, runs the GitLab parser, generates reviews via the shared LLM pipeline, and pushes results
- Removed stale `OPENAI_API_KEY` guards (model availability is now enforced inside `invokeDiffReviewLlm`)

### `packages/web/src/features/agents/review-agent/nodes/invokeDiffReviewLlm.ts`

**Replaces the hardcoded OpenAI client** with the Vercel AI SDK's `generateText` and the shared `getAISDKLanguageModelAndOptions` / `getConfiguredLanguageModels` utilities from `chat/utils.server.ts`. The review agent now uses whichever language model is configured in `config.json`, supporting all providers (Anthropic, Bedrock, Azure, etc.).

- `REVIEW_AGENT_MODEL` env var (matched against `displayName`) selects a specific model when multiple are configured; falls back to `models[0]` with a warning if the name is not found
- Prompt is passed via the `system` parameter with a `"Review the code changes."` user turn, satisfying providers (e.g. Bedrock/Anthropic) that require conversations to begin with a user message

### `packages/web/src/features/agents/review-agent/nodes/fetchFileContent.ts`

**Fixes "Not authenticated" error** when the review agent calls `getFileSource`. The original implementation used `withOptionalAuth`, which reads a session cookie — absent in webhook handlers. Now calls `getFileSourceForRepo` directly with `__unsafePrisma` and the single-tenant org, bypassing the session-based auth layer. The webhook handler has already authenticated the request via its own mechanism (GitHub App signature / GitLab token).

### `packages/web/src/features/git/getFileSourceApi.ts`

- Extracted the core repo-lookup + git + language-detection logic into a new exported `getFileSourceForRepo({ path, repo, ref }, { org, prisma })` function
- `getFileSource` now handles auth and audit logging then delegates to `getFileSourceForRepo` — all existing callers are unchanged

### `packages/web/src/app/api/(server)/webhook/route.ts`

- Added GitLab webhook handling alongside the existing GitHub branch
- Verifies `x-gitlab-token` against `GITLAB_REVIEW_AGENT_WEBHOOK_SECRET`
- Handles `Merge Request Hook` events (auto-review on `open`, `update`, `reopen`) and `Note Hook` events (manual `/review` command on MR comments)
- Initialises a `Gitlab` client at module load if `GITLAB_REVIEW_AGENT_TOKEN` is set

### `packages/web/src/app/(app)/agents/page.tsx`

- Split the single "Review Agent" card into two separate cards: **GitHub Review Agent** and **GitLab Review Agent**, each showing its own configuration status
- Removed `OPENAI_API_KEY` from the GitHub card's required env vars (no longer applicable)

### `packages/web/src/app/(app)/components/navigationMenu/navigationItems.tsx` & `index.tsx`

- Added an **Agents** nav item (with `BotIcon`) between Repositories and Settings
- Visible when the user is authenticated **and** at least one agent is configured (GitHub App triple or GitLab token pair), computed in the server component and passed down as `isAgentsVisible`

### `packages/shared/src/env.server.ts`

Added four new environment variables:

| Variable | Purpose |
|---|---|
| `GITLAB_REVIEW_AGENT_WEBHOOK_SECRET` | Verifies the `x-gitlab-token` header on incoming webhooks |
| `GITLAB_REVIEW_AGENT_TOKEN` | Personal or project access token used for GitLab API calls |
| `GITLAB_REVIEW_AGENT_HOST` | GitLab hostname (defaults to `gitlab.com`; set for self-hosted instances) |
| `REVIEW_AGENT_MODEL` | `displayName` of the configured language model to use for reviews; falls back to the first model if unset or not matched |

### `packages/web/package.json`

Added `@gitbeaker/rest` dependency (already used in `packages/backend`).

---

## Bug fixes

| Bug | Fix |
|---|---|
| `"Not authenticated"` when fetching file content from the review agent | `fetchFileContent` now calls `getFileSourceForRepo` directly instead of `getFileSource` (which gates on session auth) |
| `"diff_refs is missing"` when posting GitLab MR reviews | `gitlabMrParser` now fetches the full MR via `MergeRequests.show()` instead of relying on the webhook payload, which omits `diff_refs` on `update` events |
| Bedrock/Anthropic rejection: `"A conversation must start with a user message"` | `invokeDiffReviewLlm` now passes the prompt via `system` + a `prompt` user turn instead of a `system`-role entry inside `messages` |
| Review agent silently used `models[0]` with no way to specify a different model | New `REVIEW_AGENT_MODEL` env var selects by `displayName` |

* Add CHANGELOG.md entry
* Address coderabbit review comments
* Update `review-agent` docs
* Add `zod` schema to Gitlab events
* Fix `repoName` calculation for Gitlab MRs
* Add type annotation
* Make sure that `GITLAB_REVIEW_AGENT_HOST` is valid hostname
* Address final Coderabbit comments
* Include the security fixes from #1134

---------

Co-authored-by: Gavin Williams <gavin.williams@getchip.uk>
Co-authored-by: Brendan Kellam <brendan@sourcebot.dev>
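
The `REVIEW_AGENT_MODEL` fallback described in the commit message above (match on `displayName`, otherwise use the first configured model with a warning) might be sketched as below. The `LanguageModelConfig` shape, the function name `selectReviewModel`, and the injectable `warn` callback are all hypothetical; the project's real types and logger are not shown on this page.

```typescript
// Hypothetical minimal shape of a configured model entry.
interface LanguageModelConfig {
    displayName?: string;
    model: string;
}

// Picks the model named by REVIEW_AGENT_MODEL, falling back to the first entry.
function selectReviewModel(
    models: LanguageModelConfig[],
    requested?: string,
    warn: (message: string) => void = console.warn,
): LanguageModelConfig {
    if (models.length === 0) {
        throw new Error("No language models are configured");
    }
    // Unset variable: use the first configured model without a warning.
    if (!requested) {
        return models[0];
    }
    const match = models.find((m) => m.displayName === requested);
    if (!match) {
        // Name not found: warn and fall back rather than failing the review.
        warn(`REVIEW_AGENT_MODEL "${requested}" not found; using the first configured model`);
        return models[0];
    }
    return match;
}
```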

cursoragent and others added 2 commits April 18, 2026 01:57

chore: add CHANGELOG entry for path injection fix

87894f8

Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>

msukkari marked this pull request as ready for review April 18, 2026 02:04

docs: add guidance to link PRs to Linear issues in PR description

f887d91

Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>

coderabbitai Bot reviewed Apr 18, 2026

Comment thread packages/web/src/features/agents/review-agent/nodes/invokeDiffReviewLlm.ts Outdated

cursoragent and others added 5 commits April 18, 2026 02:28

docs: add requirement to run tests and build before pushing

476081e

Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>

Revert "docs: add requirement to run tests and build before pushing"

c891f82

This reverts commit 476081e.

Merge branch 'main' into cursor/fix-path-injection-codeql-05f4

154b096

Merge branch 'main' into cursor/fix-path-injection-codeql-05f4

3ed9937

msukkari merged commit ddea515 into main Apr 18, 2026
8 checks passed

github-actions Bot mentioned this pull request Apr 18, 2026

Sourcebot Roadmap 🚀 #459

Open

fatmcgav pushed a commit to fatmcgav/sourcebot that referenced this pull request Apr 20, 2026

Include the security fixes from sourcebot-dev#1134

6af0c9d

fatmcgav pushed a commit to fatmcgav/sourcebot that referenced this pull request Apr 20, 2026

Include the security fixes from sourcebot-dev#1134

0944a80

msukkari merged 9 commits into main from cursor/fix-path-injection-codeql-05f4

2 participants
